Search | VHL Regional Portal

The Monarch Initiative in 2024: an analytic platform integrating phenotypes, genes and diseases across species.

Putman, Tim E; Schaper, Kevin; Matentzoglu, Nicolas; Rubinetti, Vincent P; Alquaddoomi, Faisal S; Cox, Corey; Caufield, J Harry; Elsarboukh, Glass; Gehrke, Sarah; Hegde, Harshad; Reese, Justin T; Braun, Ian; Bruskiewich, Richard M; Cappelletti, Luca; Carbon, Seth; Caron, Anita R; Chan, Lauren E; Chute, Christopher G; Cortes, Katherina G; De Souza, Vinícius; Fontana, Tommaso; Harris, Nomi L; Hartley, Emily L; Hurwitz, Eric; Jacobsen, Julius O B; Krishnamurthy, Madan; Laraway, Bryan J; McLaughlin, James A; McMurry, Julie A; Moxon, Sierra A T; Mullen, Kathleen R; O'Neil, Shawn T; Shefchek, Kent A; Stefancsik, Ray; Toro, Sabrina; Vasilevsky, Nicole A; Walls, Ramona L; Whetzel, Patricia L; Osumi-Sutherland, David; Smedley, Damian; Robinson, Peter N; Mungall, Christopher J; Haendel, Melissa A; Munoz-Torres, Monica C.

Nucleic Acids Res ; 52(D1): D938-D949, 2024 Jan 05.

Article in English | MEDLINE | ID: mdl-38000386

ABSTRACT

Bridging the gap between genetic variations, environmental determinants, and phenotypic outcomes is critical for supporting clinical diagnosis and understanding mechanisms of diseases. It requires integrating open data at a global scale. The Monarch Initiative advances these goals by developing open ontologies, semantic data models, and knowledge graphs for translational research. The Monarch App is an integrated platform combining data about genes, phenotypes, and diseases across species. Monarch's APIs enable access to carefully curated datasets and advanced analysis tools that support the understanding and diagnosis of disease for diverse applications such as variant prioritization, deep phenotyping, and patient profile-matching. We have migrated our system into a scalable, cloud-based infrastructure; simplified Monarch's data ingestion and knowledge graph integration systems; enhanced data mapping and integration standards; and developed a new user interface with novel search and graph navigation features. Furthermore, we advanced Monarch's analytic tools by developing a customized plugin for OpenAI's ChatGPT to increase the reliability of its responses about phenotypic data, allowing us to interrogate the knowledge in the Monarch graph using state-of-the-art Large Language Models. The resources of the Monarch Initiative can be found at monarchinitiative.org and its corresponding code repository at github.com/monarch-initiative/monarch-app.

Subject(s)

Databases, Factual , Disease , Genes , Phenotype , Humans , Internet , Databases, Factual/standards , Software , Genes/genetics , Disease/genetics

KG-Hub: building and exchanging biological knowledge graphs.

Caufield, J Harry; Putman, Tim; Schaper, Kevin; Unni, Deepak R; Hegde, Harshad; Callahan, Tiffany J; Cappelletti, Luca; Moxon, Sierra A T; Ravanmehr, Vida; Carbon, Seth; Chan, Lauren E; Cortes, Katherina; Shefchek, Kent A; Elsarboukh, Glass; Balhoff, Jim; Fontana, Tommaso; Matentzoglu, Nicolas; Bruskiewich, Richard M; Thessen, Anne E; Harris, Nomi L; Munoz-Torres, Monica C; Haendel, Melissa A; Robinson, Peter N; Joachimiak, Marcin P; Mungall, Christopher J; Reese, Justin T.

Bioinformatics ; 39(7), 2023 07 01.

Article in English | MEDLINE | ID: mdl-37389415

ABSTRACT

MOTIVATION: Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and making inferences in biology and many other domains, but a coherent solution for constructing, exchanging, and facilitating the downstream use of KGs is lacking. RESULTS: Here we present KG-Hub, a platform that enables standardized construction, exchange, and reuse of KGs. Features include a simple, modular extract-transform-load pattern for producing graphs compliant with Biolink Model (a high-level data model for standardizing biological data), easy integration of any OBO (Open Biological and Biomedical Ontologies) ontology, cached downloads of upstream data sources, versioned and automatically updated builds with stable URLs, web-browsable storage of KG artifacts on cloud infrastructure, and easy reuse of transformed subgraphs across projects. Current KG-Hub projects span use cases including COVID-19 research, drug repurposing, microbial-environmental interactions, and rare disease research. KG-Hub is equipped with tooling to easily analyze and manipulate KGs. KG-Hub is also tightly integrated with graph machine learning (ML) tools which allow automated graph ML, including node embeddings and training of models for link prediction and node classification. AVAILABILITY AND IMPLEMENTATION: https://kghub.org.

Subject(s)

Biological Ontologies , COVID-19 , Humans , Pattern Recognition, Automated , Rare Diseases , Machine Learning
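
The extract-transform-load pattern described in the KG-Hub abstract can be illustrated with a minimal sketch. This is not KG-Hub's own code or API: the input rows, the `EX:` identifiers, the function names, and the choice of node and edge columns are all assumptions made for illustration. The only terms borrowed from Biolink Model are the category and predicate labels.

```python
import csv
import io

# Hypothetical upstream data; identifiers and names are placeholders.
RAW = """gene_id,gene_name,disease_id,disease_name
EX:G1,gene one,EX:D1,disease one
EX:G2,gene two,EX:D1,disease one
"""


def extract(text):
    """Parse the raw CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Turn rows into de-duplicated nodes and one edge per row."""
    nodes = {}
    edges = []
    for row in rows:
        nodes[row["gene_id"]] = {
            "id": row["gene_id"],
            "category": "biolink:Gene",
            "name": row["gene_name"],
        }
        nodes[row["disease_id"]] = {
            "id": row["disease_id"],
            "category": "biolink:Disease",
            "name": row["disease_name"],
        }
        edges.append({
            "subject": row["gene_id"],
            "predicate": "biolink:related_to",  # most general predicate
            "object": row["disease_id"],
        })
    return list(nodes.values()), edges


def load(records, columns):
    """Serialize records as tab-separated text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=columns, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


nodes, edges = transform(extract(RAW))
nodes_tsv = load(nodes, ["id", "category", "name"])
edges_tsv = load(edges, ["subject", "predicate", "object"])
print(nodes_tsv)
print(edges_tsv)
```

Keeping each stage as a separate function mirrors the modularity the abstract emphasizes: a transformed subgraph (the node and edge tables) could in principle be reused by another project without repeating the extract step.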

Biolink Model: A universal schema for knowledge graphs in clinical, biomedical, and translational science.

Unni, Deepak R; Moxon, Sierra A T; Bada, Michael; Brush, Matthew; Bruskiewich, Richard; Caufield, J Harry; Clemons, Paul A; Dancik, Vlado; Dumontier, Michel; Fecho, Karamarie; Glusman, Gustavo; Hadlock, Jennifer J; Harris, Nomi L; Joshi, Arpita; Putman, Tim; Qin, Guangrong; Ramsey, Stephen A; Shefchek, Kent A; Solbrig, Harold; Soman, Karthik; Thessen, Anne E; Haendel, Melissa A; Bizon, Chris; Mungall, Christopher J.

Clin Transl Sci ; 15(8): 1848-1855, 2022 08.

Article in English | MEDLINE | ID: mdl-36125173

ABSTRACT

Within clinical, biomedical, and translational science, an increasing number of projects are adopting graphs for knowledge representation. Graph-based data models elucidate the interconnectedness among core biomedical concepts, enable data structures to be easily updated, and support intuitive queries, visualizations, and inference algorithms. However, knowledge discovery across these "knowledge graphs" (KGs) has remained difficult. Data set heterogeneity and complexity; the proliferation of ad hoc data formats; poor compliance with guidelines on findability, accessibility, interoperability, and reusability; and, in particular, the lack of a universally accepted, open-access model for standardization across biomedical KGs has left the task of reconciling data sources to downstream consumers. Biolink Model is an open-source data model that can be used to formalize the relationships between data structures in translational science. It incorporates object-oriented classification and graph-oriented features. The core of the model is a set of hierarchical, interconnected classes (or categories) and relationships between them (or predicates) representing biomedical entities such as gene, disease, chemical, anatomic structure, and phenotype. The model provides class and edge attributes and associations that guide how entities should relate to one another. Here, we highlight the need for a standardized data model for KGs, describe Biolink Model, and compare it with other models. We demonstrate the utility of Biolink Model in various initiatives, including the Biomedical Data Translator Consortium and the Monarch Initiative, and show how it has supported easier integration and interoperability of biomedical KGs, bringing together knowledge from multiple sources and helping to realize the goals of translational science.

Subject(s)

Pattern Recognition, Automated , Translational Science, Biomedical , Knowledge
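
The idea of hierarchical classes plus predicates that constrain how entities relate can be sketched as follows. This is a toy model under stated assumptions, not the Biolink Model schema itself: the four-class hierarchy, the two predicate constraints, the `EX:` identifiers, and the helper names `is_a` and `is_valid` are all hypothetical and far simpler than the real model.

```python
from dataclasses import dataclass

# Toy hierarchy: child category -> parent category (None marks the root).
PARENT = {
    "named thing": None,
    "gene": "named thing",
    "disease": "named thing",
    "phenotype": "named thing",
}

# Toy constraints: predicate -> (allowed subject category, allowed object category).
PREDICATES = {
    "related to": ("named thing", "named thing"),
    "has phenotype": ("disease", "phenotype"),
}


@dataclass(frozen=True)
class Node:
    id: str
    category: str


@dataclass(frozen=True)
class Edge:
    subject: Node
    predicate: str
    object: Node


def is_a(category, ancestor):
    """Return True if `category` equals `ancestor` or descends from it."""
    while category is not None:
        if category == ancestor:
            return True
        category = PARENT[category]
    return False


def is_valid(edge):
    """Check an edge against the subject/object constraints of its predicate."""
    if edge.predicate not in PREDICATES:
        return False
    subject_category, object_category = PREDICATES[edge.predicate]
    return (
        is_a(edge.subject.category, subject_category)
        and is_a(edge.object.category, object_category)
    )


disease = Node("EX:D1", "disease")
phenotype = Node("EX:P1", "phenotype")
gene = Node("EX:G1", "gene")

print(is_valid(Edge(disease, "has phenotype", phenotype)))  # True
print(is_valid(Edge(gene, "has phenotype", phenotype)))     # False
print(is_valid(Edge(gene, "related to", disease)))          # True
```

Because validity is checked through the hierarchy, a general predicate such as "related to" accepts any pair of entities, while a specific one such as "has phenotype" rejects a subject of the wrong category.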

Underrepresentation of Phenotypic Variability of 16p13.11 Microduplication Syndrome Assessed With an Online Self-Phenotyping Tool (Phenotypr): Cohort Study.

Li, Jianqiao; Hojlo, Margaret A; Chennuri, Sampath; Gujral, Nitin; Paterson, Heather L; Shefchek, Kent A; Genetti, Casie A; Cohn, Emily L; Sewalk, Kara C; Garvey, Emily A; Buttermore, Elizabeth D; Anderson, Nickesha C; Beggs, Alan H; Agrawal, Pankaj B; Brownstein, John S; Haendel, Melissa A; Holm, Ingrid A; Gonzalez-Heydrich, Joseph; Brownstein, Catherine A.

J Med Internet Res ; 23(3): e21023, 2021 03 16.

Article in English | MEDLINE | ID: mdl-33724192

ABSTRACT

BACKGROUND: 16p13.11 microduplication syndrome has a variable presentation and is characterized primarily by neurodevelopmental and physical phenotypes resulting from copy number variation at chromosome 16p13.11. Given its variability, there may be features that have not yet been reported. The goal of this study was to use a patient "self-phenotyping" survey to collect data directly from patients to further characterize the phenotypes of 16p13.11 microduplication syndrome. OBJECTIVE: This study aimed to (1) discover self-identified phenotypes in 16p13.11 microduplication syndrome that have been underrepresented in the scientific literature and (2) demonstrate that self-phenotyping tools are valuable sources of data for the medical and scientific communities. METHODS: As part of a large study to compare and evaluate patient self-phenotyping surveys, an online survey tool, Phenotypr, was developed for patients with rare disorders to self-report phenotypes. Participants with 16p13.11 microduplication syndrome were recruited through the Boston Children's Hospital 16p13.11 Registry. Either the caregiver, parent, or legal guardian of an affected child or the affected person (if aged 18 years or above) completed the survey. Results were securely transferred to a Research Electronic Data Capture database and aggregated for analysis. RESULTS: A total of 19 participants enrolled in the study. Notably, among the 19 participants, aggression and anxiety were mentioned by 3 (16%) and 4 (21%) participants, respectively, which is an increase over the numbers in previously published literature. Additionally, among the 19 participants, 3 (16%) had asthma and 2 (11%) had other immunological disorders, both of which have not been previously described in the syndrome. CONCLUSIONS: Several phenotypes might be underrepresented in the previous 16p13.11 microduplication literature, and new possible phenotypes have been identified. Whenever possible, patients should continue to be referenced as a source of complete phenotyping data on their condition. Self-phenotyping may lead to a better understanding of the prevalence of phenotypes in genetic disorders and may identify previously unreported phenotypes.

Subject(s)

DNA Copy Number Variations , Family , Biological Variation, Population , Cohort Studies , Humans , Phenotype

KG-COVID-19: A Framework to Produce Customized Knowledge Graphs for COVID-19 Response.

Reese, Justin T; Unni, Deepak; Callahan, Tiffany J; Cappelletti, Luca; Ravanmehr, Vida; Carbon, Seth; Shefchek, Kent A; Good, Benjamin M; Balhoff, James P; Fontana, Tommaso; Blau, Hannah; Matentzoglu, Nicolas; Harris, Nomi L; Munoz-Torres, Monica C; Haendel, Melissa A; Robinson, Peter N; Joachimiak, Marcin P; Mungall, Christopher J.

Patterns (N Y) ; 2(1): 100155, 2021 Jan 08.

Article in English | MEDLINE | ID: mdl-33196056

ABSTRACT

Integrated, up-to-date data about SARS-CoV-2 and COVID-19 is crucial for the ongoing response to the COVID-19 pandemic by the biomedical research community. While rich biological knowledge exists for SARS-CoV-2 and related viruses (SARS-CoV, MERS-CoV), integrating this knowledge is difficult and time-consuming, since much of it is in siloed databases or in textual format. Furthermore, the data required by the research community vary drastically for different tasks; the optimal data for a machine learning task, for example, is much different from the data used to populate a browsable user interface for clinicians. To address these challenges, we created KG-COVID-19, a flexible framework that ingests and integrates heterogeneous biomedical data to produce knowledge graphs (KGs), and applied it to create a KG for COVID-19 response. This KG framework also can be applied to other problems in which siloed biomedical data must be quickly integrated for different research applications, including future pandemics.

The Monarch Initiative in 2019: an integrative data and analytic platform connecting phenotypes to genotypes across species.

Shefchek, Kent A; Harris, Nomi L; Gargano, Michael; Matentzoglu, Nicolas; Unni, Deepak; Brush, Matthew; Keith, Daniel; Conlin, Tom; Vasilevsky, Nicole; Zhang, Xingmin Aaron; Balhoff, James P; Babb, Larry; Bello, Susan M; Blau, Hannah; Bradford, Yvonne; Carbon, Seth; Carmody, Leigh; Chan, Lauren E; Cipriani, Valentina; Cuzick, Alayne; Della Rocca, Maria; Dunn, Nathan; Essaid, Shahim; Fey, Petra; Grove, Chris; Gourdine, Jean-Phillipe; Hamosh, Ada; Harris, Midori; Helbig, Ingo; Hoatlin, Maureen; Joachimiak, Marcin; Jupp, Simon; Lett, Kenneth B; Lewis, Suzanna E; McNamara, Craig; Pendlington, Zoë M; Pilgrim, Clare; Putman, Tim; Ravanmehr, Vida; Reese, Justin; Riggs, Erin; Robb, Sofia; Roncaglia, Paola; Seager, James; Segerdell, Erik; Similuk, Morgan; Storm, Andrea L; Thaxon, Courtney; Thessen, Anne; Jacobsen, Julius O B.

Nucleic Acids Res ; 48(D1): D704-D715, 2020 01 08.

Article in English | MEDLINE | ID: mdl-31701156

ABSTRACT

In biology and biomedicine, relating phenotypic outcomes with genetic variation and environmental factors remains a challenge: patient phenotypes may not match known diseases, candidate variants may be in genes that haven't been characterized, research organisms may not recapitulate human or veterinary diseases, environmental factors affecting disease outcomes are unknown or undocumented, and many resources must be queried to find potentially significant phenotypic associations. The Monarch Initiative (https://monarchinitiative.org) integrates information on genes, variants, genotypes, phenotypes and diseases in a variety of species, and allows powerful ontology-based search. We develop many widely adopted ontologies that together enable sophisticated computational analysis, mechanistic discovery and diagnostics of Mendelian diseases. Our algorithms and tools are widely used to identify animal models of human disease through phenotypic similarity, for differential diagnostics and to facilitate translational research. Launched in 2015, Monarch has grown with regards to data (new organisms, more sources, better modeling); new API and standards; ontologies (new Mondo unified disease ontology, improvements to ontologies such as HPO and uPheno); user interface (a redesigned website); and community development. Monarch data, algorithms and tools are being used and extended by resources such as GA4GH and NCATS Translator, among others, to aid mechanistic discovery and diagnostics.

Subject(s)

Computational Biology/methods , Genotype , Phenotype , Algorithms , Animals , Biological Ontologies , Databases, Genetic , Exome , Genetic Association Studies , Genetic Variation , Genomics , Humans , Internet , Software , Translational Research, Biomedical , User-Computer Interface
